06 - Files and modules
Files
In Python, reading and writing file contents is very convenient. A function called open() is used to get an object corresponding to a file, similar to the C language. Its arguments are the filename and the file access mode (r - read, w - write (overwrites existing content), a - append at the end of the file, r+ - read and write):
file_r = open('data.txt', 'r')  # opens a file for reading
data = file_r.read()  # reads the whole file and saves to a string variable
file_r.close()  # closes the file
file_w = open('new_data.txt', 'w')  # opens a new file for writing
file_w.write(data)  # writes contents of a variable to a file
file_w.close()  # closes the second file
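The explicit close() calls above can also be left to Python by opening the file in a with block. The sketch below is only an illustration: it writes to a temporary directory, and the file contents are made up for the example.

```python
import os
import tempfile

# Hypothetical file location inside a temporary directory, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'new_data.txt')

# The file is closed automatically when the with block ends.
with open(path, 'w') as file_w:
    file_w.write('first line\nsecond line\n')

with open(path, 'r') as file_r:
    data = file_r.read()

print(data)
print(file_r.closed)  # the file object reports that it is already closed
```

This form is usually preferred, because the file is closed even if an error occurs inside the block.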
The text can be easily split into a list of strings, each containing a single line:
file = open('data.txt', 'r')
list_of_lines = file.read().splitlines()
for line in list_of_lines:
    print(line)
file.close()
Very often data is written to text files, with values delimited by a specific character, for example:
1;2;3;4
45;34;12;32;54;21
4;5;6332;23;2
In the example above, consecutive numbers are separated with a semicolon (;). The easiest way to separate a line into multiple elements is to use the split() method, which returns a list of strings. The example below reads the file line by line and sums the numbers in each line.
file = open('numbers_to_sum.txt', 'r')
for line in file.read().splitlines():  # for each line in file
    numbers_strings = line.split(';')  # split the line into a list of strings
    # the strings have to be converted to numeric values
    numbers = []  # create an empty list
    for nbr in numbers_strings:  # for each element in the list of strings
        numbers.append(int(nbr))  # convert the string to an integer and add to the list
    print(sum(numbers))  # sum all the integers
file.close()
10
198
6366
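The same split-and-convert step can be written more compactly with a list comprehension. The sketch below wraps it in a helper function; the name sum_line is made up for this example, and the three lines from the sample file are typed in directly so that the code runs without numbers_to_sum.txt.

```python
def sum_line(line, delimiter=';'):
    """Return the sum of the integers stored in one delimited line of text."""
    return sum([int(nbr) for nbr in line.split(delimiter)])


# The three lines from the example file, given here as a list of strings.
lines = ['1;2;3;4', '45;34;12;32;54;21', '4;5;6332;23;2']

for line in lines:
    print(sum_line(line))
```

It prints the same three sums as the loop-based version.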
Modules
A lot of useful functionality is not built into the base interpreter, but is available in the form of modules. A module is a library containing a specific set of functions (similar to C/C++ libraries). Some modules are included in the default Python installation, others have to be installed using the pip package manager. It is also possible to write custom modules. Usually, modules are loaded at the beginning of the script using the import statement, for example:
import os
import glob
# [...]
After importing a library, all its functionality is available under the object of the same name (library_name.function_name()), for example os.chdir().
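As a small illustration of the library_name.function_name() pattern, the sketch below uses two modules from the standard library. The file name data.txt is only a placeholder; the file does not have to exist.

```python
import os
import math

current_dir = os.getcwd()  # function from the os module: current working directory
data_path = os.path.join(current_dir, 'data.txt')  # os.path is a submodule of os

print(data_path)
print(math.sqrt(16))  # function from the math module
```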
requests module
The requests module is a quick and simple library for HTTP access (HTTP is the protocol used by web browsers). To download and print the source code of a webpage, two commands are enough:
import requests
req = requests.get("http://google.com")
print(req.text)
Many services are available as web APIs, where the response can contain some requested information.
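Web APIs typically receive their parameters in the query string of the address. The sketch below only builds such an address and does not send any request, so it runs without a network connection. The address api.example.com, the parameter names and the helper build_url are all hypothetical and serve only as an illustration.

```python
from urllib.parse import urlencode


def build_url(base_url, params):
    """Append the parameters to base_url as a query string."""
    return base_url + '?' + urlencode(params)


# Hypothetical API address and parameters, for illustration only.
url = build_url('https://api.example.com/weather', {'city': 'Poznan', 'units': 'metric'})
print(url)
```

The resulting address can then be passed to requests.get(url). Alternatively, requests.get() accepts a params argument and builds the query string itself.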
NumPy
NumPy is a popular library for convenient matrix operations, providing functionality very similar to MATLAB.
Import the library at the beginning of the script using the following command:
import numpy as np
The above command differs from the previously used import statements: the library is imported under the name np to shorten the code that follows. This means all its functions are accessed as np.function_name, not numpy.function_name. This convention (np as an alias for numpy) is almost universally used. Creating your own abbreviation is strongly discouraged, as it makes the code harder for others to read.
Creating matrices
The easiest way to create a NumPy matrix is to convert a Python list using the np.array() function. For 1-dimensional matrices (vectors), individual element access is the same as in Python lists. Remember that indexing, contrary to MATLAB, is 0-based:
a = np.array([-1, 3.14, 0])  # creates a 1D matrix (vector)
print("Dimensions:", a.shape)
print(a)
a[0] = 5
print(a)
Similarly, it is possible to create a 2D matrix based on a list of lists (rows):
b = np.array([[10, 20, 30], [41, 51, 61]])  # creates a 2 by 3 matrix
print("Dimensions:", b.shape)
print("Number of rows:", b.shape[0])
print("Number of columns:", b.shape[1])
print("Various elements:", b[0, 0], b[0, 1], b[1, 0])
Commonly used types of matrices can be generated using the following functions:
c = np.zeros([2, 2])  # initialized with zeros
print("zeros:")
print(c)
print()
d = np.ones([1, 2])  # initialized with ones
print("ones:")
print(d)
print()
e = np.full([2, 2], 7)  # initialized with a constant value
print("full:")
print(e)
print()
f = np.eye(4)  # identity matrix
print("eye:")
print(f)
print()
g = np.random.random([2, 3])  # random values from uniform <0...1> range
print("random:")
print(g)
The shape attribute returns a tuple, but you can access its elements the same way as with lists, using square brackets.
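Because shape is a tuple, it can also be unpacked directly into separate variables. A minimal sketch, reusing the 2 by 3 matrix from the earlier example (the names rows and cols are chosen only for this illustration):

```python
import numpy as np

b = np.array([[10, 20, 30], [41, 51, 61]])

rows, cols = b.shape  # tuple unpacking: one variable per dimension
print(type(b.shape))
print("rows:", rows, "columns:", cols)
```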
Often it is required to create a vector of evenly spaced values. Two functions can be used for that purpose: np.arange(start, stop, step) and np.linspace(start, stop, num):
x = np.arange(10, 30, 5)  # vector from 10 to 30 (right-open interval), with step of 5
print("arange:", x)
print()
y = np.linspace(0, 2*np.pi, 10)  # vector from 0 to 2pi (closed interval), with 10 elements
print("linspace:", y)
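The difference between the right-open and the closed interval is easiest to see when both functions are given the same limits. The values below are chosen only for this sketch:

```python
import numpy as np

x = np.arange(10, 30, 5)    # third argument is the step; 30 is not included
y = np.linspace(10, 30, 5)  # third argument is the number of elements; 30 is included

print("arange:  ", x, "-> number of elements:", len(x))
print("linspace:", y, "-> number of elements:", len(y))
```

Note that arange returns integers here, because all its arguments are integers, while linspace returns floating-point values.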
Basic math operations
With NumPy arrays, basic math operations (addition, subtraction, multiplication, division, exponentiation) are performed element-wise using the standard operators. The input matrices have to be of compatible sizes, and the result is returned as a new array. Additionally, basic functions compatible with arrays, such as np.sin and np.sqrt, are available. The full list of math routines can be found in the documentation: https://docs.scipy.org/doc/numpy/reference/routines.math.html
a = np.array([20, 30, 40, 50])
b = np.arange(0, 4)
c = a - b
print("subtraction:")
print(c)
print("exponentiation by a scalar:")
print(b**2)
print("value of function 10*sin(a):")
print(10*np.sin(a))
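What "compatible sizes" means can be checked with a short experiment. In the sketch below, two arrays of the same shape and an array combined with a scalar work as expected, while two vectors of different lengths raise an error. The variable shapes_compatible is introduced only for this illustration.

```python
import numpy as np

a = np.array([20, 30, 40, 50])
b = np.arange(0, 4)

print(a + b)  # same shape: element-wise addition
print(a * 2)  # a scalar is applied to every element

shapes_compatible = True
try:
    a + np.array([1, 2, 3])  # 4 elements and 3 elements do not match
except ValueError as error:
    shapes_compatible = False
    print("incompatible shapes:", error)
```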
Contrary to MATLAB, the * operator used with NumPy arrays performs the operation per-element. To perform matrix multiplication, use the @ operator:
A = np.array([[1, 0],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])

print("per-element multiplication:")
print(A * B)
print()
print("matrix multiplication:")
print(A @ B)
Data plotting - Matplotlib
Matplotlib is a popular library used to generate plots in Python. It is tightly integrated with NumPy, and its interface is close to the one used for plotting in MATLAB. Usually matplotlib.pyplot is imported as plt:
import matplotlib.pyplot as plt
Check the example below:
x = np.linspace(0, 10, 1000)
y = x**2

plt.figure()  # creates a new plot
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.legend(["y=x^2"])
plt.show()  # show the plot window
You should get a plot of y = x^2 for x from 0 to 10, with both axes labelled and a legend.
Note that the legend function accepts a list of labels, even if only a single one is passed.
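An alternative to passing a list of labels is to give each curve its own label in the plot call and then call legend without arguments. The sketch below assumes a non-interactive backend (Agg) so that it runs without opening a window; in a normal interactive session, remove the matplotlib.use line and call plt.show() at the end.

```python
import matplotlib
matplotlib.use("Agg")  # assumption for this sketch: draw off-screen, no plot window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 1000)

plt.figure()
plt.plot(x, x**2, label="y=x^2")              # label attached to the curve itself
plt.plot(x, 20*np.sin(x), label="y=20sin(x)")
legend = plt.legend()                         # collects the labels given above

labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```

With this approach, a label stays attached to its curve even if the order of the plot calls changes.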
Contrary to MATLAB, consecutive plot calls will not overwrite existing content. Formatting the plot is very similar to the MATLAB syntax. You will find the full description in the library documentation: https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.plot.html
x = np.linspace(0, 10, 1000)

plt.figure()
plt.plot(x, 20*np.sin(x), 'r')
plt.plot(x[::50], x[::50]**2, '^k')
plt.plot(x, 5*np.cos(x), '--g')
plt.legend(["y=20sin(x)", "y=x^2", "y=5cos(x)"])
plt.show()
It is possible to pass a matrix as the y values of plot. Each column is then treated as a separate data series:
plt.figure()
plt.plot(np.random.random((30, 2)))
plt.show()
It is also possible to create figures with several subplots:
t = np.linspace(-np.pi, np.pi, 100)

plt.figure()
plt.subplot(1, 2, 1)
plt.plot(t, np.sin(t))
plt.subplot(1, 2, 2)
plt.plot(t, np.cosh(t))
plt.show()
Final assignments 🔥 🔨
🔨 🔥 Files 🔥 🔨
- An input file with students’ test grades is given: students.txt. Read the grades and calculate the final grade (average of test grades) for each student.
- Print surnames, names and final grades in the following format:
Doe Jane: 4.5
Jobs Dave: 2.0
Best Stephen: 2.0
🔨 🔥 Requests 🔥 🔨
Your dormitory roommate mines Bitcoin. Write a script for him, which will check current exchange rates for BTC with respect to common currencies. Use HTTP API available at https://blockchain.info/. The link to check a specific currency:
https://blockchain.info/tobtc?currency=USD&value=1
Check the output in a web browser.
The link is constructed as follows: https://blockchain.info/tobtc?currency= + currency_symbol + &value= + converted_amount. Your script should check the amount of BTC you can buy for 1 unit of each of the following currencies: USD, EUR, RUB, GBP, CHF, and print the output to the console:
1 USD to BTC: 0.0001249
1 EUR to BTC: 0.00013677
1 RUB to BTC: 0.00000192
1 GBP to BTC: 0.0001536
1 CHF to BTC: 0.00012545
Avoid repetitive code, use loops where possible.
🔨 🔥 NumPy, Matplotlib 🔥 🔨
Create a vector of x values from -5 to 5 (inclusive), with a step of 0.1.
For those x arguments, calculate a series of Gaussian curves described by the following formula:
With several sets of parameters:
Plot the curves in a single figure.
Generate a legend label list automatically based on the parameter list. Add a legend to the figure.
Authors: Jakub Tomczyński, Tomasz Mańkowski